Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a safety check to prevent large files from being fetched entirely, which could cause token limit errors. The implementation adds a 1000-line threshold before returning file content.
- Adds a size check that returns an error message for files exceeding 1000 lines
- Directs users to use the
get_file_linestool for large files instead
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if not source_path or not source_path.exists(): | ||
| return f"Invalid {owner} and {repo}. Check that the input is correct or try to fetch the repo from gh first." | ||
| lines = get_file(source_path, path) | ||
| if len(lines) > 1000: |
There was a problem hiding this comment.
The magic number 1000 should be extracted as a named constant at the module level. This makes the limit configurable and self-documenting. Consider defining something like MAX_FILE_LINES = 1000 at the top of the file with the other constants.
| if len(lines) > 1000: | ||
| return f"File {path} in {owner}/{repo} is too large to display ({len(lines)} lines). Please fetch specific lines using get_file_lines tool." |
There was a problem hiding this comment.
The size check happens after get_file has already loaded all lines into memory (line 108). For very large files, this could still cause memory issues or performance problems before the check is performed. Consider checking the file size or line count before fully loading the content, or use a streaming approach that stops reading once the limit is exceeded.
See below for a potential fix:
# Stream file lines to avoid loading large files into memory
lines = []
found = False
filename = strip_leading_dash(path)
with zipfile.ZipFile(source_path) as z:
for entry in z.infolist():
if entry.is_dir():
continue
if remove_root_dir(entry.filename) == filename:
found = True
with z.open(entry, 'r') as f:
for i, line in enumerate(f):
if i >= 1000:
return f"File {path} in {owner}/{repo} is too large to display ({i} lines). Please fetch specific lines using get_file_lines tool."
lines.append(f"{i+1}: {line.rstrip()}")
break
if not found:
return f"Unable to find file {path} in {owner}/{repo}"
…file exit gracefully with large file
To avoid large token limit error when fetching file.